Abstract


We present a relaxation model based on an N-dimensional Coulomb potential.
The model has arbitrarily large storage capacity and, in addition,
well-defined basins of attraction about stored memory states. The model
is compared with the Hopfield relaxation model.
1 Introduction

Equilibrium associative and distributed memories that are content addressable
and have the ability to recall stored memories more or less imperfectly
have been known and studied for years [1], [2], [3], [4], [5]. At the same time,
relaxation models have been the subject of much exploration [6]. In 1982,
Hopfield [7] introduced a relaxation model of memory storage and retrieval
that incorporates simultaneously a distributed memory correlation matrix
and a relaxation process from a given input to an equilibrium state.
Although learning procedures can be included, the model has not emphasized
these. Among its problems is poor recall of stored memories when the
number of items stored exceeds some percentage of the number of neurons
involved.

The correlation matrix originally employed by Hopfield has relatively
weak recall properties when employed as an equilibrium distributed memory.
It gives perfect recall only when the inputs are orthogonal. When the inputs
are not orthogonal, one can still achieve perfect recall by some orthogonal
modification procedure such as Widrow-Hoff [8], or what Kohonen calls an
optimal associative mapping [9]. Such procedures work if the number of
stored memories is equal to or smaller than the dimension of the system
(the number of synapses on each neuron). A procedure for storing as many
memories as desired for a given dimension has also been discussed [10]. In
this procedure items can be stored at arbitrary points on a hypersphere with
variable regions of influence.

In  this  paper  we  present  a  general  method  for  the  construction  of  a 
relaxation  memory  in  which  an  arbitrary  number  of  items  can  be  stored.  The 
essence  of  the  problem  is  to  define  a  function  whose  minima  lie  at  designated 
points,  corresponding  to  the  items  to  be  stored,  and  to  show  that  these  are 
the  only  minima  of  the  function.  Then  an  appropriate  relaxation  procedure 
is  defined,  so  that  any  entering  pattern  relaxes  to  one  of  the  stored  items. 

2 Hopfield's Model and Some Improvements

In the Hopfield model [7], neurons are binary-valued threshold units and are
completely interconnected, with the strength of the connections given by
a correlation matrix formed from the memory states to be stored in the
system:

$$\omega_{ij} = \sum_{s=1}^{m} \mu_i^s \mu_j^s \qquad (1)$$

where $\mu_i^s = \pm 1$. Input states are relaxed to local minima of a Liapunov
function,

$$\mathcal{E} = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \omega_{ij} \mu_i \mu_j \qquad (2)$$

by random, asynchronous updating of the neurons in the layer according to:

$$\mu_i' = 2\theta\Big(\sum_{j=1}^{N} \omega_{ij} \mu_j\Big) - 1 \qquad (3)$$
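As an illustration only, the storage rule (1), the Liapunov function (2) and the update (3) can be sketched in a few lines of Python. The function names, the convention $\theta(0) = 1$, and the choice to visit the neurons in a freshly shuffled order on every sweep are assumptions made for this sketch, not details taken from [7].

```python
import random

def hopfield_weights(memories):
    """Correlation matrix of equation (1): w[i][j] = sum_s mu_i^s * mu_j^s."""
    n = len(memories[0])
    return [[sum(mem[i] * mem[j] for mem in memories) for j in range(n)]
            for i in range(n)]

def hopfield_energy(w, state):
    """Liapunov function of equation (2)."""
    n = len(state)
    return -0.5 * sum(w[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))

def hopfield_relax(w, state, sweeps=20, seed=0):
    """Asynchronous updates following equation (3), one neuron at a time."""
    rng = random.Random(seed)
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        # Assumption: each sweep visits every neuron once, in random order.
        for i in rng.sample(range(n), n):
            field = sum(w[i][j] * state[j] for j in range(n))
            # 2*theta(field) - 1, taking theta(0) = 1 (an assumption).
            state[i] = 1 if field >= 0 else -1
    return state
```

With two orthogonal stored states, a probe that differs from one of them in a single component relaxes back to that state in this sketch.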


In its original form, the Hopfield model functions poorly as a categorizer
when $(m/N) > 0.1$, where m = the number of stored states and N = the
number of neurons. Given the limitations of the original model, improvements
have been sought. "Unlearning", an approach first tried by Hopfield
[11], relaxes random states to stable states (often spurious attractors);
a correlation matrix is formed from the relaxed state, and
then an amount proportional to this is subtracted from the original matrix:


$$\omega_{ij}' = \omega_{ij} - \alpha \, \mu_i^{\mathrm{relaxed}} \mu_j^{\mathrm{relaxed}} \qquad (4)$$
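A minimal sketch of one "unlearning" step, under stated assumptions: the proportionality constant is called `alpha` here, the random starting state is drawn uniformly from the $\pm 1$ vectors, and the relaxation uses a fixed number of shuffled sweeps. None of these choices is specified in the text above; they are placeholders for illustration.

```python
import random

def relax(w, state, rng, sweeps=10):
    """Asynchronous threshold updates for a fixed number of sweeps."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in rng.sample(range(n), n):
            field = sum(w[i][j] * state[j] for j in range(n))
            state[i] = 1 if field >= 0 else -1
    return state

def unlearn_step(w, alpha, rng):
    """Relax a random state, then subtract alpha times its correlation matrix."""
    n = len(w)
    start = [rng.choice((-1, 1)) for _ in range(n)]
    relaxed = relax(w, start, rng)
    new_w = [[w[i][j] - alpha * relaxed[i] * relaxed[j] for j in range(n)]
             for i in range(n)]
    return new_w, relaxed
```

Because the subtracted term is an outer product of the relaxed state with itself, a symmetric matrix stays symmetric after the step.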


With "unlearning" the number of stored states that can be correctly recalled
approaches the dimensionality, N, and error correction is improved but falls
to zero as $m \to N$ [12].

Recently, an interesting variation of Hopfield's "unlearning" has been
studied by Potter [12]. The algorithm is a hybrid combining elements of
Hopfield's "unlearning" with a modification reminiscent of the Widrow-Hoff
algorithm [8]:

[Equation (5): illegible in the source]

The symmetry of the synaptic matrix is preserved by making the same
modification to $\omega_{ji}$ each time a modification is performed on the
element $\omega_{ij}$.
In simulations for which all of the input states at a radius of one Hamming
unit from each stored state were used for the modification procedure, a
radius of attraction of one Hamming unit was observed for m just below the
dimensionality, N. Above the dimensionality, the radius of attraction and
the percentage of stable stored states decay. In [13], it has been shown
that Potter's algorithm may be viewed as an 'effective orthogonalization' of
the input with respect to the nonlinear relaxation process; a more complete
discussion of Potter's algorithm is given there.
3 High Density Storage Model

In  what  follows  we  present  a  general  method  for  the  construction  of  a  high 
storage  density  neural  memory.  We  define  a  function  with  an  arbitrary 
number  of  minima  that  lie  at  preassigned  points  and  define  an  appropriate 
relaxation  procedure. 

Let $\vec{x}_1, \ldots, \vec{x}_m$ be a set of m arbitrary distinct memories
in $R^N$. The "energy" function we will use is:

$$\mathcal{E} = -\frac{1}{L} \sum_{i=1}^{m} Q_i \, |\vec{\mu} - \vec{x}_i|^{-L} \qquad (6)$$

where we assume throughout that $N \ge 3$, $L \ge (N - 2)$, and $Q_i > 0$, and
use $|\cdots|$ to denote the Euclidean distance. Note that for L = 1, N = 3,
$\mathcal{E}$ is the electrostatic potential induced by negative fixed
particles with charges $-Q_i$. This "energy" function possesses global minima
at $\vec{x}_1, \ldots, \vec{x}_m$ (where $\mathcal{E}(\vec{x}_i) = -\infty$)
and has no local minima except at these points. A rigorous
proof is presented in Dembo and Zeitouni [14] together with the complete
characterization of functions having this property.
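The energy function (6) is straightforward to evaluate numerically. The sketch below is illustrative only; the function name `coulomb_energy` and the decision to return negative infinity exactly at a stored memory are assumptions of this example.

```python
import math

def coulomb_energy(mu, memories, charges, L):
    """Energy of equation (6): -(1/L) * sum_i Q_i * |mu - x_i|**(-L)."""
    total = 0.0
    for x, q in zip(memories, charges):
        dist = math.dist(mu, x)      # Euclidean distance |mu - x_i|
        if dist == 0.0:
            return -math.inf         # the energy diverges at a stored memory
        total += q * dist ** (-L)
    return -total / L
```

For two unit charges at (0, 0, 0) and (2, 0, 0) with L = 1, the midpoint has energy -2, and the energy decreases without bound as the input approaches either memory.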

As a relaxation procedure, we can choose any dynamical system for which
$\mathcal{E}$ is strictly decreasing. In this instance, the theory of
dynamical systems guarantees that for almost any initial data, the trajectory
of the system converges to one of the desired points
$\vec{x}_1, \ldots, \vec{x}_m$. However, to give concrete
results and to further exploit the resemblance to electrostatics, consider the
relaxation:

$$\frac{d\vec{\mu}}{dt} = \vec{E}_{\vec{\mu}} = -\sum_{i=1}^{m} Q_i \, |\vec{\mu} - \vec{x}_i|^{-(L+2)} (\vec{\mu} - \vec{x}_i) \qquad (7)$$

where for N = 3, L = 1, equation (7) describes the motion of a positive test
particle in the electrostatic field $\vec{E}_{\vec{\mu}}$ generated by the
negative fixed charges $-Q_1, \ldots, -Q_m$.

Since the field $\vec{E}_{\vec{\mu}}$ is just minus the gradient of
$\mathcal{E}$, it is clear that along trajectories of (7),
$d\mathcal{E}/dt \le 0$, with equality only at the fixed points of (7),
which are exactly the stationary points of $\mathcal{E}$.
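The monotone decrease of the energy follows from the chain rule. The short derivation below spells out that step, writing $\vec{E}_{\vec{\mu}} = -\nabla\mathcal{E}$ for the right-hand side of (7):

```latex
\frac{d\mathcal{E}}{dt}
  = \nabla\mathcal{E}(\vec{\mu}) \cdot \frac{d\vec{\mu}}{dt}
  = \nabla\mathcal{E}(\vec{\mu}) \cdot \bigl(-\nabla\mathcal{E}(\vec{\mu})\bigr)
  = -\bigl|\nabla\mathcal{E}(\vec{\mu})\bigr|^{2} \le 0 .
```

Equality holds exactly where the gradient vanishes, that is, at the stationary points of the energy.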

Therefore, using (7) as the relaxation procedure, we can conclude that
entering at any $\vec{\mu}(0)$ the system converges to a stationary point of
$\mathcal{E}$. The space of inputs is partitioned into m domains of
attraction, each one corresponding to a different memory, and the boundaries
(a set of measure zero), on which $\vec{\mu}(0)$ will converge to a saddle
point of $\mathcal{E}$.

We can now explain why $\mathcal{E}$ has no spurious local minima, at least
for L = 1, N = 3, using elementary physical arguments. Suppose $\mathcal{E}$
has a spurious local minimum at $\vec{y} \ne \vec{x}_1, \ldots, \vec{x}_m$;
then in a small neighborhood of $\vec{y}$ which does not include any of the
$\vec{x}_i$, the field $\vec{E}_{\vec{\mu}}$ points towards $\vec{y}$. Thus,
on any closed surface in that neighborhood, the integral of the normal inward
component of $\vec{E}_{\vec{\mu}}$ is positive. However, by Gauss's law this
integral is determined by the total charge included inside the surface, which
is zero. Thus we arrive at a contradiction, so $\vec{y}$ cannot be a local
minimum.

We now have a relaxation procedure such that almost any $\vec{\mu}(0)$ is
attracted by one of the $\vec{x}_i$, but we have not yet specified the shapes
of the basins of attraction. By varying the charges $Q_i$, we can enlarge one
basin of attraction at the expense of the others (and vice versa).
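The relaxation (7) and the effect of the charges on the basins can be explored with a small numerical sketch. Everything about the integrator is an assumption of this example rather than part of the model: it takes steps of fixed length along the field direction (which follows the same trajectories but not the same time scale as (7)), and it declares convergence once the state is within a tolerance `tol` of a memory.

```python
import math

def field(mu, memories, charges, L):
    """Right-hand side of (7): -sum_i Q_i |mu - x_i|^-(L+2) (mu - x_i)."""
    out = [0.0] * len(mu)
    for x, q in zip(memories, charges):
        d = math.dist(mu, x)
        scale = q * d ** (-(L + 2))
        for k in range(len(mu)):
            out[k] -= scale * (mu[k] - x[k])
    return out

def relax(mu, memories, charges, L, step=1e-2, tol=2e-2, max_steps=100_000):
    """Follow the field in fixed-length steps; return the index of the memory reached."""
    mu = list(mu)
    for _ in range(max_steps):
        # Stop as soon as the state is within tol of a stored memory.
        for idx, x in enumerate(memories):
            if math.dist(mu, x) < tol:
                return idx
        f = field(mu, memories, charges, L)
        norm = math.sqrt(sum(c * c for c in f))
        if norm == 0.0:
            return None              # stationary point that is not a memory
        mu = [m + step * c / norm for m, c in zip(mu, f)]
    return None

memories = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
start = (2.2, 0.0, 0.0)
equal = relax(start, memories, [1.0, 1.0], L=1)      # nearer memory wins
biased = relax(start, memories, [10.0, 1.0], L=1)    # larger charge wins
```

In this example the input at (2.2, 0, 0) reaches the second memory when the charges are equal, and the first memory when its charge is raised to 10, which illustrates how a basin grows with its charge.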

Even when all of the $Q_i$ are equal, the position of the $\vec{x}_i$ might
cause $\vec{\mu}(0)$ not to converge to the closest memory, as emphasized in
the example in fig 1. However, let
$r = \min_{i \ne j} |\vec{x}_i - \vec{x}_j|$ be the minimal distance


Figure 1: $\vec{\mu}(0)$ is closer to $\vec{x}_1$ but converges to $\vec{x}_2$, due to the existence of $\vec{x}_3$
(assuming  R  1  and  S  1). 


between any two memories; then, if
$|\vec{\mu}(0) - \vec{x}_i| < r / (1 + 3^{1/k})$, it can be shown
that $\vec{\mu}(0)$ will converge to $\vec{x}_i$, provided that
$k = (L+1)/(N+1) > 1$. Thus, if the
memories are densely packed in a hypersphere, by choosing k large enough
(i.e. enlarging the parameter L), convergence to the closest memory for any
"interesting" input, that is an input $\vec{\mu}(0)$ with a distinctive
closest memory, is guaranteed.

The detailed proof of the above property is given in [14]. It is based on
bounding the number of $\vec{x}_j$, $j \ne i$, in a hypersphere of radius R
(R > r) around $\vec{x}_i$ by $[2R/r + 1]^N$, then bounding the magnitude of
the field induced by any $\vec{x}_j$, $j \ne i$, on the boundary of such a
hypersphere by $(R - |\vec{\mu}(0) - \vec{x}_i|)^{-(L+1)}$, and finally
integrating to show that for
$|\vec{\mu}(0) - \vec{x}_i| < \theta r / (1 + 3^{1/k})$, with $\theta < 1$,
the convergence of $\vec{\mu}(0)$ to $\vec{x}_i$ is within finite time T, which
behaves like $\theta^{L+2}$ for $L \gg 1$ and $\theta < 1$ fixed. Intuitively
the reason for this behaviour is the short-range nature of the fields used in
equation (7). Because of this, we also expect an extremely low convergence
rate for inputs $\vec{\mu}(0)$ far away from all of the $\vec{x}_i$.

The radial nature of these fields suggests a way to overcome this difficulty,
that is to increase the convergence rate from points very far away, without
disturbing all of the aforementioned desirable properties of the model.
Assume that we know in advance that all of the $\vec{x}_i$ lie inside some
large hypersphere S around the origin. Then, at any point $\vec{\mu}$ outside
S, the field $\vec{E}_{\vec{\mu}}$ has a positive projection radially into S.
By adding a long-range force to $\vec{E}_{\vec{\mu}}$, effective only outside
of S, we can hasten the movement towards S, from points far away, without
creating additional minima inside of S. As an example the force
($-\vec{\mu}$ for $\vec{\mu} \notin S$; 0 for $\vec{\mu} \in S$) will pull any
test input $\vec{\mu}(0)$
to  the  boundary  of  S  within  the  small  finite  time  j|j,  and  from  then  on 

the system will behave inside S according to the original field $\vec{E}_{\vec{\mu}}$.

Up to this point, our derivations have been for a continuous system, but
from it, we can deduce a discrete system. We shall do this mainly for a
clearer comparison between our high density memory model and the discrete
version of Hopfield's model. Before continuing in that direction, note
that our continuous system has unlimited storage capacity, unlike Hopfield's
continuous system [15], which, like his discrete model, has limited capacity.

For the discrete system, assume that the $\vec{x}_i$ are composed of elements
$\pm 1$ and replace the Euclidean distance in (6) with the normalized Hamming
distance
$|\vec{\mu}^1 - \vec{\mu}^2| = \frac{1}{N} \sum_{j=1}^{N} |\mu_j^1 - \mu_j^2|$.
This places the vectors $\vec{x}_i$ on the unit hypersphere.

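The relaxation process for the discrete system will be of the type defined
in Hopfield's model in equation (3). Choose at random a component to be
updated (that is, a neighbor $\vec{\mu}'$ of $\vec{\mu}$ such that
$|\vec{\mu}' - \vec{\mu}| = 2/N$), calculate
the "energy" difference, $\delta\mathcal{E} = \mathcal{E}(\vec{\mu}') - \mathcal{E}(\vec{\mu})$,
and only if $\delta\mathcal{E} < 0$, change this component, that is:

$$\mu_i' = \begin{cases} -\mu_i, & \delta\mathcal{E} < 0 \\ \mu_i, & \text{otherwise} \end{cases} \qquad (8)$$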

where $\mathcal{E}(\vec{\mu})$ is the potential energy in (6). Since there is
a finite number of possible $\vec{\mu}$ vectors ($2^N$), convergence in finite
time is guaranteed.
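The discrete procedure can be sketched as follows. This is a minimal illustration under stated assumptions: components are visited in a shuffled order on each sweep instead of being drawn one at a time, the loop stops after a full sweep without any accepted flip, and the helper names are hypothetical.

```python
import random

def hamming(a, b):
    """Normalized Hamming distance (1/N) * sum_j |a_j - b_j| for +/-1 vectors."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def energy(mu, memories, charges, L):
    """Energy (6) with the Euclidean distance replaced by the Hamming distance."""
    total = 0.0
    for x, q in zip(memories, charges):
        d = hamming(mu, x)
        if d == 0:
            return float("-inf")     # mu coincides with a stored memory
        total += q * d ** (-L)
    return -total / L

def discrete_relax(mu, memories, charges, L, seed=0, max_sweeps=100):
    """Flip single components only when the flip lowers the energy."""
    rng = random.Random(seed)
    mu = list(mu)
    n = len(mu)
    for _ in range(max_sweeps):
        changed = False
        for i in rng.sample(range(n), n):
            trial = mu[:]
            trial[i] = -trial[i]
            if energy(trial, memories, charges, L) < energy(mu, memories, charges, L):
                mu = trial
                changed = True
        if not changed:
            break                    # no single flip lowers the energy
    return mu
```

With two stored states of length 8 that differ in four components and L = 8, a probe with one wrong component is corrected by a single accepted flip in this sketch.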
This relaxation procedure is rigid since the movement is limited to points
with components $\pm 1$. Therefore, although the local minima of
$\mathcal{E}(\vec{\mu})$ defined in (6) are only at the desired points
$\vec{x}_i$, the relaxation may get stuck at some $\vec{\mu}$ which is not a
stationary point of $\mathcal{E}(\vec{\mu})$. However, the short-range
behaviour of the potential $\mathcal{E}(\vec{\mu})$, unlike the long-range
behaviour of the quadratic potential used by Hopfield (equation (2)), gives
rise to results similar to those we have quoted for the continuous model
(equation (7)).

Specifically, let the stored memories $\vec{x}_1, \ldots, \vec{x}_m$ be
separated from one another by having at least $\rho N$ different components
($0 < \rho < 1/2$ and $\rho$ fixed), and let $\vec{\mu}(0)$ agree with at
least one $\vec{x}_i$ up to at most $\theta \rho N$ errors
($0 < \theta < 1/2$, with $\theta$ fixed); then $\vec{\mu}(0)$ converges
monotonically to $\vec{x}_i$ by the relaxation procedure given in
equation (8).

This result holds independently of m, provided that N is large enough
(typically, $N \rho \ln\frac{1-\theta}{\theta} > 1$) and L is chosen so that
$\frac{N}{L} < \ln\frac{1-\theta}{\theta}$. The proof
is constructed by bounding the cumulative effect of the terms
$|\vec{\mu} - \vec{x}_j|^{-L}$, $j \ne i$, on the energy difference and
showing that it is dominated by $|\vec{\mu} - \vec{x}_i|^{-L}$. For details,
we refer the reader again to [14].

Note the importance of this property: unlike the Hopfield model, which
is limited to m < N, the suggested system is optimal in the sense of
Information Theory, since for every set of memories
$\vec{x}_1, \ldots, \vec{x}_m$ separated from each other by a Hamming
distance $\rho N$, up to $\frac{1}{2}\rho N$ errors in the input can
be corrected, provided that N is large and L properly chosen.

As for the complexity of the system, we note that the nonlinear operation
$a^{-L}$, for a > 0 and L integer (which is at the heart of our system
computationally), is equivalent to $e^{-L \ln(a)}$ and can be implemented,
therefore, by a simple electrical circuit composed of diodes, which have
exponential input-output characteristics, and resistors, which can carry out
the necessary multiplications.
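The identity behind this remark is easy to check numerically. The helper below is a hypothetical illustration of computing the inverse power through a logarithm, a multiplication and an exponential; it says nothing about the circuit itself.

```python
import math

def inverse_power(a, L):
    """Compute a**(-L) for a > 0 as exp(-L * ln(a))."""
    if a <= 0:
        raise ValueError("a must be positive")
    return math.exp(-L * math.log(a))
```

For example, `inverse_power(2.0, 3)` agrees with `2.0 ** -3` up to floating-point rounding.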

Further, since both $|\vec{x}_i|$ and $|\vec{\mu}|$ are held fixed in the
discrete system, where all states are on the unit hypersphere,
$|\vec{\mu} - \vec{x}_i|^2$ is equivalent to the
inner product of $\vec{\mu}$ and $\vec{x}_i$, up to a constant.
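The equivalence can be made explicit by expanding the square. The worked step below assumes both vectors have unit Euclidean norm, as stated for the discrete system:

```latex
|\vec{\mu} - \vec{x}_i|^{2}
  = |\vec{\mu}|^{2} + |\vec{x}_i|^{2} - 2\,\vec{\mu}\cdot\vec{x}_i
  = 2 - 2\,\vec{\mu}\cdot\vec{x}_i .
```

The squared distance is therefore an affine, decreasing function of the inner product, so computing one is as cheap as computing the other.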

To conclude, the suggested model involves about $m \cdot N$ multiplications,
followed by m nonlinear operations, and then $m \cdot N$ additions. The
original model of Hopfield involves $N^2$ multiplications and additions, and
then N nonlinear operations, but is limited to m < N. Therefore, whenever the
Hopfield model is applicable the complexity of both models is comparable.
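The operation counts quoted above can be tabulated for concrete sizes. The two helpers below are hypothetical bookkeeping functions that simply restate those counts; they are not measurements of any implementation.

```python
def coulomb_ops(m, n):
    """Approximate operation counts for the suggested model, as quoted above."""
    return {"multiplications": m * n, "nonlinear": m, "additions": m * n}

def hopfield_ops(n):
    """Operation counts for the original Hopfield model, as quoted above."""
    return {"multiplications": n * n, "nonlinear": n, "additions": n * n}
```

For any m below N the suggested model needs no more multiplications than the Hopfield model, and the two counts coincide at m = N.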